Dataset statistics
| Number of variables | 23 |
|---|---|
| Number of observations | 1000000 |
| Missing cells | 926672 |
| Missing cells (%) | 4.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 58.2 MiB |
| Average record size in memory | 61.0 B |
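The derived figures in this table can be cross-checked from the raw counts. The following is a minimal sketch that copies the numbers from this report (the per-column missing counts come from the "Missing" warnings) and assumes that 1 MiB is 1024² bytes; the variable names are illustrative only.

```python
# Figures copied from the overview table and the "Missing" warnings of this report.
n_variables = 23
n_observations = 1_000_000
missing_by_column = {"internet": 179_331, "age": 52_174, "race": 478_178, "indig": 216_989}
total_size_mib = 58.2

# Missing cells are the sum of the per-column missing counts.
missing_cells = sum(missing_by_column.values())

# Percentage of all cells (variables x observations) that are missing.
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Average record size: total memory (assuming 1 MiB = 1024**2 bytes) per observation.
avg_record_bytes = total_size_mib * 1024**2 / n_observations

print(missing_cells)                 # 926672
print(f"{missing_pct:.1f}%")         # 4.0%
print(f"{avg_record_bytes:.1f} B")   # 61.0 B
```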
Variable types
| CAT | 15 |
|---|---|
| NUM | 8 |
Dataset
| Description | This report was generated from only one million observations (1.90% of the total). |
|---|---|
| URL | http://international.ipums.org/ |
| Copyright | (c) IPUMS International 2020 |
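The stated sampling fraction can be sanity-checked against the df_index statistics reported further down. This sketch assumes, without confirmation from the report, that df_index is the zero-based row position in the full dataset, so that the largest sampled index plus one is a lower bound on the full row count.

```python
sample_rows = 1_000_000
reported_pct = 1.90        # as printed in the description, rounded to two decimals
max_df_index = 52_546_643  # largest df_index in the sample (assumed zero-based row position)

# Range of full-dataset sizes for which the sample would round to 1.90%.
lo = sample_rows / ((reported_pct + 0.005) / 100)
hi = sample_rows / ((reported_pct - 0.005) / 100)

# Under the assumption above, the implied row count should fall inside that range.
consistent = lo <= max_df_index + 1 <= hi
print(round(lo), round(hi), consistent)
```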
| Warning | Type |
|---|---|
| perwt is highly correlated with hhwt | High correlation |
| hhwt is highly correlated with perwt | High correlation |
| year is highly correlated with country and 1 other field | High correlation |
| country is highly correlated with year and 1 other field | High correlation |
| sample is highly correlated with country and 1 other field | High correlation |
| edattaind is highly correlated with edattain | High correlation |
| edattain is highly correlated with edattaind | High correlation |
| empstatd is highly correlated with empstat | High correlation |
| empstat is highly correlated with empstatd | High correlation |
| internet has 179331 (17.9%) missing values | Missing |
| age has 52174 (5.2%) missing values | Missing |
| race has 478178 (47.8%) missing values | Missing |
| indig has 216989 (21.7%) missing values | Missing |
| df_index has unique values | Unique |
Reproduction
| Analysis started | 2020-11-17 18:39:20.657597 |
|---|---|
| Analysis finished | 2020-11-17 18:40:38.718288 |
| Duration | 1 minute and 18.06 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
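The reported duration follows directly from the two timestamps. A small sketch using only the standard library, with the timestamps copied from the table above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2020-11-17 18:39:20.657597")
finished = datetime.fromisoformat("2020-11-17 18:40:38.718288")

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute and {seconds:.2f} seconds")  # 1 minute and 18.06 seconds
```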
df_index
Real number (ℝ≥0)
| Distinct | 1000000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26284932.2 |
|---|---|
| Minimum | 121 |
| Maximum | 52546643 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 121 |
|---|---|
| 5-th percentile | 2628396 |
| Q1 | 13142317 |
| median | 26304903.5 |
| Q3 | 39417358.75 |
| 95-th percentile | 49911590.75 |
| Maximum | 52546643 |
| Range | 52546522 |
| Interquartile range (IQR) | 26275041.75 |
Descriptive statistics
| Standard deviation | 15167175.84 |
|---|---|
| Coefficient of variation (CV) | 0.5770292928 |
| Kurtosis | -1.200299036 |
| Mean | 26284932.2 |
| Median Absolute Deviation (MAD) | 13138195 |
| Skewness | -0.001385719184 |
| Sum | 2.62849322e+13 |
| Variance | 2.300432229e+14 |
| Monotonicity | Not monotonic |
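Several of the statistics above are simple functions of the others. The sketch below recomputes the range, IQR, coefficient of variation and variance from the values printed in this section. As an additional, hedged observation, it compares the variance with that of a continuous uniform distribution over the same range, (max − min)² / 12; a uniform distribution also has an excess kurtosis of −1.2, close to the reported −1.2003. This is only suggestive of the index values being spread evenly over the full dataset; the report itself does not state how the sample was drawn.

```python
# Values copied from the tables in this section.
minimum, maximum = 121, 52_546_643
q1, q3 = 13_142_317, 39_417_358.75
mean, std = 26_284_932.2, 15_167_175.84

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

# Variance a continuous uniform distribution would have over the same range.
uniform_variance = value_range ** 2 / 12

print(value_range, iqr)
print(round(cv, 6))
print(f"{variance:.4e} vs uniform {uniform_variance:.4e}")
```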
| Value | Count | Frequency (%) | |
| 33558526 | 1 | < 0.1% | |
| 11021623 | 1 | < 0.1% | |
| 5331826 | 1 | < 0.1% | |
| 22127469 | 1 | < 0.1% | |
| 22123371 | 1 | < 0.1% | |
| 24222570 | 1 | < 0.1% | |
| 4189598 | 1 | < 0.1% | |
| 24240997 | 1 | < 0.1% | |
| 13757284 | 1 | < 0.1% | |
| 17945443 | 1 | < 0.1% | |
| Other values (999990) | 999990 | > 99.9% |
| Value | Count | Frequency (%) | |
| 121 | 1 | < 0.1% | |
| 200 | 1 | < 0.1% | |
| 230 | 1 | < 0.1% | |
| 343 | 1 | < 0.1% | |
| 375 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 52546643 | 1 | < 0.1% | |
| 52546439 | 1 | < 0.1% | |
| 52546429 | 1 | < 0.1% | |
| 52546387 | 1 | < 0.1% | |
| 52546384 | 1 | < 0.1% |
country
Categorical
| Distinct | 16 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 977.3 KiB |
[Bar chart of category frequencies]
| Value | Count | Frequency (%) | |
| brazil | 392607 | 39.3% | |
| mexico | 216252 | 21.6% | |
| colombia | 76130 | 7.6% | |
| argentina | 75268 | 7.5% | |
| peru | 52196 | 5.2% | |
| venezuela | 43727 | 4.4% | |
| chile | 28496 | 2.8% | |
| ecuador | 27605 | 2.8% | |
| dominican republic | 17865 | 1.8% | |
| haiti | 16281 | 1.6% | |
| Other values (6) | 53573 | 5.4% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 18 |
|---|---|
| Median length | 6 |
| Mean length | 6.749365 |
| Min length | 4 |
year
Categorical
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 976.9 KiB |
[Bar chart of category frequencies]
| Value | Count | Frequency (%) | |
| 2010 | 519814 | 52.0% | |
| 2015 | 216252 | 21.6% | |
| 2005 | 86102 | 8.6% | |
| 2007 | 63170 | 6.3% | |
| 2001 | 55379 | 5.5% | |
| 2002 | 28496 | 2.8% | |
| 2003 | 16281 | 1.6% | |
| 2011 | 14506 | 1.5% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
sample
Categorical
| Distinct | 16 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 977.3 KiB |
[Bar chart of category frequencies]
| Value | Count | Frequency (%) | |
| brazil 2010 | 392607 | 39.3% | |
| mexico 2015 | 216252 | 21.6% | |
| colombia 2005 | 76130 | 7.6% | |
| argentina 2010 | 75268 | 7.5% | |
| peru 2007 | 52196 | 5.2% | |
| venezuela 2001 | 43727 | 4.4% | |
| chile 2002 | 28496 | 2.8% | |
| ecuador 2010 | 27605 | 2.8% | |
| dominican republic 2010 | 17865 | 1.8% | |
| haiti 2003 | 16281 | 1.6% | |
| Other values (6) | 53573 | 5.4% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 23 |
|---|---|
| Median length | 11 |
| Mean length | 11.749365 |
| Min length | 9 |
serial
Real number (ℝ≥0)
| Distinct | 899049 |
|---|---|
| Distinct (%) | 89.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1618102350 |
|---|---|
| Minimum | 1000 |
| Maximum | 6192502000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 1000 |
|---|---|
| 5-th percentile | 46236900 |
| Q1 | 300343000.8 |
| median | 915235001 |
| Q3 | 2507746500 |
| 95-th percentile | 5334000650 |
| Maximum | 6192502000 |
| Range | 6192501000 |
| Interquartile range (IQR) | 2207403499 |
Descriptive statistics
| Standard deviation | 1676355600 |
|---|---|
| Coefficient of variation (CV) | 1.036000968 |
| Kurtosis | 0.2089770035 |
| Mean | 1618102350 |
| Median Absolute Deviation (MAD) | 782556500.5 |
| Skewness | 1.140027561 |
| Sum | 1.61810235e+15 |
| Variance | 2.810168099e+18 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 13350000 | 6 | < 0.1% | |
| 45449000 | 6 | < 0.1% | |
| 425223001 | 6 | < 0.1% | |
| 54600000 | 6 | < 0.1% | |
| 74039000 | 6 | < 0.1% | |
| 6876000 | 6 | < 0.1% | |
| 2435000 | 6 | < 0.1% | |
| 42452000 | 6 | < 0.1% | |
| 46315000 | 6 | < 0.1% | |
| 22929001 | 5 | < 0.1% | |
| Other values (899039) | 999941 | > 99.9% |
| Value | Count | Frequency (%) | |
| 1000 | 1 | < 0.1% | |
| 3000 | 1 | < 0.1% | |
| 4000 | 1 | < 0.1% | |
| 5001 | 1 | < 0.1% | |
| 8000 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 6192502000 | 1 | < 0.1% | |
| 6192496000 | 1 | < 0.1% | |
| 6192463000 | 1 | < 0.1% | |
| 6192459000 | 1 | < 0.1% | |
| 6192449000 | 1 | < 0.1% |
persons
Real number (ℝ≥0)
| Distinct | 40 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.662609 |
|---|---|
| Minimum | 1 |
| Maximum | 50 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 976.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 3 |
| median | 4 |
| Q3 | 6 |
| 95-th percentile | 9 |
| Maximum | 50 |
| Range | 49 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.329983636 |
|---|---|
| Coefficient of variation (CV) | 0.4997167113 |
| Kurtosis | 7.948146601 |
| Mean | 4.662609 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.737270608 |
| Sum | 4662609 |
| Variance | 5.428823742 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 4 | 227561 | 22.8% | |
| 3 | 178336 | 17.8% | |
| 5 | 176208 | 17.6% | |
| 6 | 111278 | 11.1% | |
| 2 | 105235 | 10.5% | |
| 7 | 69180 | 6.9% | |
| 8 | 37809 | 3.8% | |
| 1 | 35276 | 3.5% | |
| 9 | 22469 | 2.2% | |
| 10 | 14972 | 1.5% | |
| Other values (30) | 21676 | 2.2% |
| Value | Count | Frequency (%) | |
| 1 | 35276 | 3.5% | |
| 2 | 105235 | 10.5% | |
| 3 | 178336 | 17.8% | |
| 4 | 227561 | 22.8% | |
| 5 | 176208 | 17.6% |
| Value | Count | Frequency (%) | |
| 50 | 1 | < 0.1% | |
| 44 | 1 | < 0.1% | |
| 43 | 1 | < 0.1% | |
| 42 | 4 | < 0.1% | |
| 40 | 3 | < 0.1% |
hhwt
Real number (ℝ≥0)
| Distinct | 6175 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.8261696 |
|---|---|
| Minimum | 0 |
| Maximum | 490 |
| Zeros | 441 |
| Zeros (%) | < 0.1% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 4.64 |
| median | 10 |
| Q3 | 10 |
| 95-th percentile | 22.39 |
| Maximum | 490 |
| Range | 490 |
| Interquartile range (IQR) | 5.36 |
Descriptive statistics
| Standard deviation | 9.38108173 |
|---|---|
| Coefficient of variation (CV) | 0.9547038279 |
| Kurtosis | 123.6488759 |
| Mean | 9.8261696 |
| Median Absolute Deviation (MAD) | 2.55 |
| Skewness | 7.28882983 |
| Sum | 9826169.6 |
| Variance | 88.00469443 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 10 | 327279 | 32.7% | |
| 2 | 53766 | 5.4% | |
| 4 | 51667 | 5.2% | |
| 6 | 31752 | 3.2% | |
| 4.64 | 28724 | 2.9% | |
| 8 | 18909 | 1.9% | |
| 12 | 7750 | 0.8% | |
| 14 | 5681 | 0.6% | |
| 16 | 4311 | 0.4% | |
| 18 | 3415 | 0.3% | |
| Other values (6165) | 466746 | 46.7% |
| Value | Count | Frequency (%) | |
| 0 | 441 | < 0.1% | |
| 0.77 | 6 | < 0.1% | |
| 0.83 | 1 | < 0.1% | |
| 0.84 | 11 | < 0.1% | |
| 0.85 | 7 | < 0.1% |
| Value | Count | Frequency (%) | |
| 490 | 3 | < 0.1% | |
| 478 | 1 | < 0.1% | |
| 410 | 3 | < 0.1% | |
| 394 | 1 | < 0.1% | |
| 376 | 2 | < 0.1% |
gq
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 976.8 KiB |
[Bar chart of category frequencies]
| Value | Count | Frequency (%) | |
| households | 995001 | 99.5% | |
| other group quarters | 2362 | 0.2% | |
| institutions | 1420 | 0.1% | |
| group quarters (collective), n.s | 666 | 0.1% | |
| 1-person unit created by splitting large household | 441 | < 0.1% | |
| unknown/group quarters not identified | 110 | < 0.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 50 |
|---|---|
| Median length | 10 |
| Mean length | 10.061722 |
| Min length | 10 |
geolev1
Real number (ℝ≥0)
| Distinct | 312 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 261391.6515 |
|---|---|
| Minimum | 32002 |
| Maximum | 862023 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.8 MiB |
Quantile statistics
| Minimum | 32002 |
|---|---|
| 5-th percentile | 32038 |
| Q1 | 76031 |
| median | 170005 |
| Q3 | 484015 |
| 95-th percentile | 604025 |
| Maximum | 862023 |
| Range | 830021 |
| Interquartile range (IQR) | 407984 |
Descriptive statistics
| Standard deviation | 234868.0863 |
|---|---|
| Coefficient of variation (CV) | 0.8985294095 |
| Kurtosis | -0.1831824525 |
| Mean | 261391.6515 |
| Median Absolute Deviation (MAD) | 93982 |
| Skewness | 0.9480444751 |
| Sum | 2.613916515e+11 |
| Variance | 5.516301796e+10 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 76035 | 69555 | 7.0% | |
| 76031 | 47966 | 4.8% | |
| 76029 | 29665 | 3.0% | |
| 32006 | 29422 | 2.9% | |
| 76043 | 26089 | 2.6% | |
| 76041 | 24756 | 2.5% | |
| 76033 | 21626 | 2.2% | |
| 484020 | 21178 | 2.1% | |
| 484030 | 19518 | 2.0% | |
| 218009 | 18922 | 1.9% | |
| Other values (302) | 691303 | 69.1% |
| Value | Count | Frequency (%) | |
| 32002 | 5516 | 0.6% | |
| 32006 | 29422 | 2.9% | |
| 32010 | 681 | 0.1% | |
| 32014 | 6230 | 0.6% | |
| 32018 | 1777 | 0.2% |
| Value | Count | Frequency (%) | |
| 862023 | 5762 | 0.6% | |
| 862022 | 1000 | 0.1% | |
| 862021 | 1164 | 0.1% | |
| 862020 | 1877 | 0.2% | |
| 862019 | 1469 | 0.1% |
internet
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 179331 |
| Missing (%) | 17.9% |
| Memory size | 976.8 KiB |
[Bar chart of category frequencies]
| Value | Count | Frequency (%) | |
| no | 384908 | 38.5% | |
| niu (not in universe) | 271723 | 27.2% | |
| yes | 159985 | 16.0% | |
| unknown | 4053 | 0.4% | |
| (Missing) | 179331 | 17.9% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 21 |
|---|---|
| Median length | 3 |
| Mean length | 7.522318 |
| Min length | 2 |
computer
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 976.8 KiB |
[Bar chart of category frequencies]
| Value | Count | Frequency (%) | |
| no | 735566 | 73.6% | |
| yes | 256429 | 25.6% | |
| niu (not in universe) | 4519 | 0.5% | |
| unknown/missing | 3486 | 0.3% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 21 |
|---|---|
| Median length | 2 |
| Mean length | 2.387608 |
| Min length | 2 |
pernum
Real number (ℝ≥0)
| Distinct | 36 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.829162 |
|---|---|
| Minimum | 1 |
| Maximum | 46 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 976.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 4 |
| 95-th percentile | 6 |
| Maximum | 46 |
| Range | 45 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.879112532 |
|---|---|
| Coefficient of variation (CV) | 0.6641940378 |
| Kurtosis | 7.18633033 |
| Mean | 2.829162 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.795066969 |
| Sum | 2829162 |
| Variance | 3.531063909 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1 | 278237 | 27.8% | |
| 2 | 243297 | 24.3% | |
| 3 | 190857 | 19.1% | |
| 4 | 130982 | 13.1% | |
| 5 | 74145 | 7.4% | |
| 6 | 38625 | 3.9% | |
| 7 | 20123 | 2.0% | |
| 8 | 10316 | 1.0% | |
| 9 | 5680 | 0.6% | |
| 10 | 3198 | 0.3% | |
| Other values (26) | 4540 | 0.5% |
| Value | Count | Frequency (%) | |
| 1 | 278237 | 27.8% | |
| 2 | 243297 | 24.3% | |
| 3 | 190857 | 19.1% | |
| 4 | 130982 | 13.1% | |
| 5 | 74145 | 7.4% |
| Value | Count | Frequency (%) | |
| 46 | 1 | < 0.1% | |
| 39 | 1 | < 0.1% | |
| 37 | 1 | < 0.1% | |
| 35 | 1 | < 0.1% | |
| 32 | 2 | < 0.1% |
perwt
Real number (ℝ≥0)
| Distinct | 6174 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.83114623 |
|---|---|
| Minimum | 0.77 |
| Maximum | 490 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 0.77 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 4.65 |
| median | 10 |
| Q3 | 10 |
| 95-th percentile | 22.39 |
| Maximum | 490 |
| Range | 489.23 |
| Interquartile range (IQR) | 5.35 |
Descriptive statistics
| Standard deviation | 9.379103761 |
|---|---|
| Coefficient of variation (CV) | 0.9540193525 |
| Kurtosis | 123.7429894 |
| Mean | 9.83114623 |
| Median Absolute Deviation (MAD) | 2.54 |
| Skewness | 7.292926847 |
| Sum | 9831146.23 |
| Variance | 87.96758735 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 10 | 327715 | 32.8% | |
| 2 | 53766 | 5.4% | |
| 4 | 51667 | 5.2% | |
| 6 | 31752 | 3.2% | |
| 4.64 | 28724 | 2.9% | |
| 8 | 18909 | 1.9% | |
| 12 | 7750 | 0.8% | |
| 14 | 5681 | 0.6% | |
| 16 | 4311 | 0.4% | |
| 18 | 3415 | 0.3% | |
| Other values (6164) | 466310 | 46.6% |
| Value | Count | Frequency (%) | |
| 0.77 | 6 | < 0.1% | |
| 0.83 | 1 | < 0.1% | |
| 0.84 | 11 | < 0.1% | |
| 0.85 | 7 | < 0.1% | |
| 0.86 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 490 | 3 | < 0.1% | |
| 478 | 1 | < 0.1% | |
| 410 | 3 | < 0.1% | |
| 394 | 1 | < 0.1% | |
| 376 | 2 | < 0.1% |
age
Real number (ℝ≥0)
| Distinct | 97 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 52174 |
| Missing (%) | 5.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 31.67749249 |
|---|---|
| Minimum | 3 |
| Maximum | 99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 3 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 15 |
| median | 28 |
| Q3 | 45 |
| 95-th percentile | 70 |
| Maximum | 99 |
| Range | 96 |
| Interquartile range (IQR) | 30 |
Descriptive statistics
| Standard deviation | 20.2215345 |
|---|---|
| Coefficient of variation (CV) | 0.63835654 |
| Kurtosis | -0.3792163021 |
| Mean | 31.67749249 |
| Median Absolute Deviation (MAD) | 15 |
| Skewness | 0.6225786121 |
| Sum | 30024751 |
| Variance | 408.9104577 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 10 | 20524 | 2.1% | |
| 14 | 20128 | 2.0% | |
| 12 | 20062 | 2.0% | |
| 15 | 20007 | 2.0% | |
| 13 | 19701 | 2.0% | |
| 11 | 19421 | 1.9% | |
| 9 | 19243 | 1.9% | |
| 16 | 18980 | 1.9% | |
| 18 | 18972 | 1.9% | |
| 8 | 18937 | 1.9% | |
| Other values (87) | 751851 | 75.2% | |
| (Missing) | 52174 | 5.2% |
| Value | Count | Frequency (%) | |
| 3 | 18187 | 1.8% | |
| 4 | 18231 | 1.8% | |
| 5 | 18468 | 1.8% | |
| 6 | 18250 | 1.8% | |
| 7 | 18694 | 1.9% |
| Value | Count | Frequency (%) | |
| 99 | 57 | < 0.1% | |
| 98 | 120 | < 0.1% | |
| 97 | 106 | < 0.1% | |
| 96 | 152 | < 0.1% | |
| 95 | 203 | < 0.1% |
sex
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 976.7 KiB |
[Bar chart of category frequencies]
| Value | Count | Frequency (%) | |
| female | 507829 | 50.8% | |
| male | 492171 | 49.2% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 5.015658 |
| Min length | 4 |
race
Categorical
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 478178 |
| Missing (%) | 47.8% |
| Memory size | 977.3 KiB |
[Bar chart of category frequencies]
| Value | Count | Frequency (%) | |
| white | 261429 | 26.1% | |
| brown (brazil) | 173620 | 17.4% | |
| black | 36413 | 3.6% | |
| mestizo (indigenous and white) | 29488 | 2.9% | |
| indigenous | 9068 | 0.9% | |
| asian | 4004 | 0.4% | |
| unknown | 3078 | 0.3% | |
| montubio (ecuador) | 2045 | 0.2% | |
| afro-ecuadorian | 1146 | 0.1% | |
| mulatto (black and white) | 1086 | 0.1% | |
| Other values (3) | 445 | < 0.1% | |
| (Missing) | 478178 | 47.8% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 30 |
|---|---|
| Median length | 5 |
| Mean length | 6.455739 |
| Min length | 3 |
indig
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 216989 |
| Missing (%) | 21.7% |
| Memory size | 976.7 KiB |
[Bar chart of category frequencies]
| Value | Count | Frequency (%) | |
| no | 689348 | 68.9% | |
| yes | 88443 | 8.8% | |
| unknown | 5220 | 0.5% | |
| (Missing) | 216989 | 21.7% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 7 |
|---|---|
| Median length | 2 |
| Mean length | 2.331532 |
| Min length | 2 |
lit
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 976.8 KiB |
[Bar chart of category frequencies]
| Value | Count | Frequency (%) | |
| yes, literate | 796402 | 79.6% | |
| no, illiterate | 120876 | 12.1% | |
| niu (not in universe) | 78998 | 7.9% | |
| unknown/missing | 3724 | 0.4% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 21 |
|---|---|
| Median length | 13 |
| Mean length | 13.760308 |
| Min length | 13 |
edattain
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 976.8 KiB |
[Bar chart of category frequencies]
| Value | Count | Frequency (%) | |
| less than primary completed | 429963 | 43.0% | |
| primary completed | 313201 | 31.3% | |
| secondary completed | 167529 | 16.8% | |
| university completed | 47850 | 4.8% | |
| niu (not in universe) | 36541 | 3.7% | |
| unknown | 4916 | 0.5% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 27 |
|---|---|
| Median length | 20 |
| Mean length | 21.875242 |
| Min length | 7 |
edattaind
Categorical
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 977.3 KiB |
[Bar chart of category frequencies]
| Value | Count | Frequency (%) | |
| some primary completed | 225020 | 22.5% | |
| primary (6 yrs) completed | 182641 | 18.3% | |
| no schooling | 159289 | 15.9% | |
| lower secondary general completed | 111377 | 11.1% | |
| secondary, general track completed | 108353 | 10.8% | |
| university completed | 47850 | 4.8% | |
| primary (4 yrs) completed | 45654 | 4.6% | |
| niu (not in universe) | 36541 | 3.7% | |
| some college completed | 34733 | 3.5% | |
| post-secondary technical education | 20280 | 2.0% | |
| Other values (4) | 28262 | 2.8% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 36 |
|---|---|
| Median length | 22 |
| Mean length | 23.834486 |
| Min length | 12 |
empstat
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 976.8 KiB |
[Bar chart of category frequencies]
| Value | Count | Frequency (%) | |
| inactive | 397926 | 39.8% | |
| employed | 376517 | 37.7% | |
| niu (not in universe) | 180669 | 18.1% | |
| unemployed | 40352 | 4.0% | |
| unknown/missing | 4536 | 0.5% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 21 |
|---|---|
| Median length | 8 |
| Mean length | 10.461153 |
| Min length | 8 |
empstatd
Categorical
| Distinct | 26 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 978.0 KiB |
[Bar chart of category frequencies]
| Value | Count | Frequency (%) | |
| at work | 349999 | 35.0% | |
| niu (not in universe) | 180669 | 18.1% | |
| inactive (not in labor force) | 167415 | 16.7% | |
| housework | 97386 | 9.7% | |
| in school | 83505 | 8.4% | |
| inactive, other reasons | 32795 | 3.3% | |
| unemployed, not specified | 31308 | 3.1% | |
| have job, not at work in reference period | 10144 | 1.0% | |
| employed, not specified | 8645 | 0.9% | |
| retirees and living on rent | 5630 | 0.6% | |
| Other values (16) | 32504 | 3.3% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 42 |
|---|---|
| Median length | 9 |
| Mean length | 15.759659 |
| Min length | 7 |
labforce
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 976.8 KiB |
[Bar chart of category frequencies]
| Value | Count | Frequency (%) | |
| yes, in the labor force | 408514 | 40.9% | |
| no, not in the labor force | 306144 | 30.6% | |
| niu (not in universe) | 281721 | 28.2% | |
| unknown | 3621 | 0.4% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 26 |
|---|---|
| Median length | 23 |
| Mean length | 23.297054 |
| Min length | 7 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear relationship, the steepness of the line does not affect the magnitude of r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
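The calculation described above can be sketched in plain Python as follows. This is an illustrative implementation on made-up data, not the code the report generator uses; the function name `pearson_r` is hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors of the covariance and the standard deviations cancel out.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Made-up example data.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)

# Shifting and rescaling x leaves r unchanged (invariance to location and scale).
r_rescaled = pearson_r([3 * a + 7 for a in x], y)
print(round(r, 6), round(r_rescaled, 6))
```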
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
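As a sketch of that definition, the code below ranks each variable (tied values share the average of their positions, which is a common convention assumed here) and then applies the covariance-over-standard-deviations formula to the ranks. The function names and data are illustrative only.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values receive the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

# A nonlinear but strictly increasing relationship (made-up data).
x = [10, 20, 30, 40, 50]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(round(rho, 6))
```

Because the relationship is strictly increasing, the ranks of x and y coincide and ρ reaches its maximum even though the relationship is not linear.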
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
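The pair-counting definition above can be sketched directly. Note that this simple form (often called τ-a) counts tied pairs as neither concordant nor discordant; library implementations commonly apply a tie correction (τ-b), so results can differ when ties are present. The function name and data are illustrative only.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction for this pair.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y; it counts as neither.
    return (concordant - discordant) / len(pairs)

# Made-up example: 8 concordant and 2 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)
print(tau)  # 0.6
```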
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for the phik package.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | df_index | country | year | sample | serial | persons | hhwt | gq | geolev1 | internet | computer | pernum | perwt | age | sex | race | indig | lit | edattain | edattaind | empstat | empstatd | labforce |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 26329292 | colombia | 2005 | colombia 2005 | 5.881000e+07 | 5 | 6.16 | households | 170005 | NaN | no | 5 | 6.16 | 11.0 | male | white | no | yes, literate | primary completed | primary (5 yrs) completed | inactive | in school | niu (not in universe) |
| 1 | 45863653 | nicaragua | 2005 | nicaragua 2005 | 2.634000e+07 | 6 | 10.00 | households | 558025 | no | no | 1 | 10.00 | 53.0 | female | NaN | no | yes, literate | less than primary completed | some primary completed | employed | at work | yes, in the labor force |
| 2 | 51297159 | venezuela | 2001 | venezuela 2001 | 2.886910e+08 | 7 | 10.00 | households | 862013 | no | no | 1 | 10.00 | 52.0 | female | NaN | NaN | no, illiterate | less than primary completed | no schooling | unknown/missing | unknown/missing | unknown |
| 3 | 2009550 | argentina | 2010 | argentina 2010 | 6.480550e+08 | 5 | 10.00 | households | 32014 | NaN | no | 3 | 10.00 | 19.0 | male | NaN | NaN | yes, literate | primary completed | primary (6 yrs) completed | employed | at work | yes, in the labor force |
| 4 | 17220209 | brazil | 2010 | brazil 2010 | 3.834011e+09 | 5 | 3.26 | households | 76035 | yes | yes | 4 | 3.26 | 17.0 | female | white | no | yes, literate | secondary completed | secondary, general track completed | inactive | inactive (not in labor force) | no, not in the labor force |
| 5 | 45452679 | mexico | 2015 | mexico 2015 | 2.853132e+09 | 5 | 4.00 | households | 484031 | no | no | 4 | 4.00 | 26.0 | female | NaN | yes | yes, literate | university completed | university completed | employed | at work | yes, in the labor force |
| 6 | 4182065 | brazil | 2010 | brazil 2010 | 6.362100e+07 | 7 | 5.63 | households | 76012 | niu (not in universe) | no | 5 | 5.63 | 22.0 | male | white | no | yes, literate | primary completed | primary (6 yrs) completed | inactive | inactive (not in labor force) | no, not in the labor force |
| 7 | 761071 | argentina | 2010 | argentina 2010 | 2.650540e+08 | 2 | 10.00 | households | 32006 | NaN | no | 2 | 10.00 | 50.0 | male | NaN | NaN | yes, literate | primary completed | primary (6 yrs) completed | employed | at work | yes, in the labor force |
| 8 | 31054217 | dominican republic | 2010 | dominican republic 2010 | 1.624880e+08 | 6 | 10.00 | households | 214008 | no | no | 6 | 10.00 | 10.0 | male | NaN | NaN | yes, literate | less than primary completed | some primary completed | inactive | in school | niu (not in universe) |
| 9 | 36215848 | mexico | 2015 | mexico 2015 | 4.576560e+08 | 8 | 28.00 | households | 484009 | yes | yes | 8 | 28.00 | 50.0 | female | NaN | no | yes, literate | secondary completed | secondary, general track completed | employed | at work | yes, in the labor force |
Last rows
| | df_index | country | year | sample | serial | persons | hhwt | gq | geolev1 | internet | computer | pernum | perwt | age | sex | race | indig | lit | edattain | edattaind | empstat | empstatd | labforce |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 999990 | 34166301 | haiti | 2003 | haiti 2003 | 1.636770e+08 | 8 | 10.00 | households | 332006 | NaN | no | 4 | 10.00 | 12.0 | female | NaN | NaN | no, illiterate | less than primary completed | no schooling | inactive | housework | niu (not in universe) |
| 999991 | 6186480 | brazil | 2010 | brazil 2010 | 5.580270e+08 | 6 | 10.39 | households | 76021 | niu (not in universe) | no | 2 | 10.39 | 7.0 | male | white | no | yes, literate | less than primary completed | some primary completed | niu (not in universe) | niu (not in universe) | niu (not in universe) |
| 999992 | 30320964 | costa rica | 2011 | costa rica 2011 | 5.718100e+07 | 5 | 10.00 | households | 188002 | no | no | 1 | 10.00 | 38.0 | male | white | no | yes, literate | less than primary completed | some primary completed | employed | at work | yes, in the labor force |
| 999993 | 41488326 | mexico | 2015 | mexico 2015 | 1.817698e+09 | 1 | 2.00 | households | 484020 | no | no | 1 | 2.00 | 57.0 | female | NaN | yes | yes, literate | less than primary completed | some primary completed | inactive | housework | no, not in the labor force |
| 999994 | 16790221 | brazil | 2010 | brazil 2010 | 3.696445e+09 | 10 | 2.48 | households | 76035 | niu (not in universe) | no | 1 | 2.48 | 39.0 | male | white | no | no, illiterate | less than primary completed | some primary completed | employed | at work | yes, in the labor force |
| 999995 | 15875195 | brazil | 2010 | brazil 2010 | 3.403780e+09 | 4 | 15.96 | households | 76033 | niu (not in universe) | no | 3 | 15.96 | 26.0 | male | brown (brazil) | no | yes, literate | primary completed | lower secondary general completed | employed | at work | yes, in the labor force |
| 999996 | 22126704 | brazil | 2010 | brazil 2010 | 5.387448e+09 | 5 | 11.96 | households | 76043 | no | yes | 3 | 11.96 | NaN | female | black | no | niu (not in universe) | less than primary completed | no schooling | niu (not in universe) | niu (not in universe) | niu (not in universe) |
| 999997 | 25764931 | chile | 2002 | chile 2002 | 3.630960e+08 | 4 | 10.00 | households | 152131 | no | no | 3 | 10.00 | 9.0 | female | NaN | no | no, illiterate | less than primary completed | some primary completed | niu (not in universe) | niu (not in universe) | niu (not in universe) |
| 999998 | 49665403 | el salvador | 2007 | el salvador 2007 | 9.858900e+07 | 5 | 10.00 | households | 222006 | no | no | 5 | 10.00 | 4.0 | male | mestizo (indigenous and white) | no | niu (not in universe) | niu (not in universe) | niu (not in universe) | employed | marginally employed | niu (not in universe) |
| 999999 | 6642712 | brazil | 2010 | brazil 2010 | 6.748320e+08 | 5 | 11.97 | households | 76022 | niu (not in universe) | no | 2 | 11.97 | 30.0 | female | brown (brazil) | no | yes, literate | secondary completed | secondary, general track completed | employed | at work | yes, in the labor force |